Abstract

This paper is concerned with errors in the observed values of
the independent variables of a linear regression. We propose
sensitivity coefficients to measure the effects of these errors and show
that they can easily be computed from quantities ordinarily calculated
in performing the regression.
Sensitivity Coefficients
for the Effects of Errors in
the Independent Variables in
a Linear Regression

G. W. Stewart

1. Introduction

In this paper we shall be concerned with the regression problem

$$\text{minimize } \|y - X\beta\|^2,$$

where $X$ is an $n \times p$ matrix of rank $p$, $y$ is an $n$-vector,
and $\|\cdot\|$ denotes the usual Euclidean vector norm.* The problem has
the unique solution

$$\beta = X^\dagger y, \tag{1.1}$$

where $X^\dagger = (X'X)^{-1}X'$ is the pseudo-inverse of $X$.

Although classical regression theory concerns itself with the
statistical analysis of errors in the vector $y$, it frequently happens that
the design matrix $X$ is itself contaminated with errors, so that one
is effectively working with a perturbed matrix $X + E$. For example, the
columns of $X$ may be measured by means of some instrument for which the
originator of the problem can only give crude error estimates.
In this case the data analyst is faced with the problem of deciding when
the effects of the errors can be ignored. The problem is especially
critical in general purpose regression routines, where it is desirable
to provide the user with a set of easily interpretable numbers that
indicate the magnitude of the effects of the errors.

* In the sequel $\|\cdot\|$ will also denote the spectral matrix norm
defined by $\|A\| = \sup\{\|Ax\| : \|x\| = 1\}$. See [3] for details.

A partial solution is provided by the perturbation theory for the
regression problem (for a survey of this theory see [4]). This theory
bounds perturbations in $\beta$ in terms of $\|E\|$ and of the "condition
number" $\kappa = \|X\|\,\|X^\dagger\|$. Although the results of this
theory shed considerable light on the behavior of regression problems
under perturbations in $X$, they are unsatisfactory in practice for two
reasons. First, they bound only the norm of the perturbation in $\beta$,
so that a large perturbation in one component can conceal the fact that
the others have small perturbations. More important, they are not scale
invariant; changing the scale of the columns of $X$ will change $\kappa$,
even though the statistical problem is essentially unaltered. This
phenomenon makes the results quite difficult to interpret.
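
The lack of scale invariance is easy to see numerically. The following
NumPy sketch is an illustration only: the data, the seed, and the scaling
matrix S are arbitrary assumptions, not taken from the paper. It rescales
the columns of a design matrix and compares the condition number with the
fitted values.

```python
import numpy as np

# Synthetic data; sizes, coefficients, and seed are arbitrary assumptions.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

def kappa(A):
    # ||A|| * ||A^+|| in the spectral norm is the ratio of the extreme
    # singular values of a full-rank matrix.
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

# A change of units in the columns: X -> X S with S diagonal.
S = np.diag([1.0, 1e3, 1e-3])
XS = X @ S

beta = np.linalg.lstsq(X, y, rcond=None)[0]
beta_s = np.linalg.lstsq(XS, y, rcond=None)[0]

kappa_X = kappa(X)
kappa_XS = kappa(XS)
# The fitted values X beta are the same for both parametrizations.
fit_gap = np.linalg.norm(X @ beta - XS @ beta_s)
print(kappa_X, kappa_XS, fit_gap)
```

The condition number grows by several orders of magnitude while the fitted
values agree to rounding error, which is the sense in which the statistical
problem is unaltered.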

Taking a different approach, Beaton, Rubin, and Barone [1] have
derived measures of sensitivity that to some extent answer the above
objections. However, it is assumed that the errors are unbiased and $n$ is
large. Swindel and Bauer [5] have derived a useful bound for the relative
bias in $\beta$, which measures the relative effects of perturbations in
$X$ compared to the usual statistical errors in $y$. In this paper we shall
derive coefficients $\gamma_{ij}$ that measure the sensitivity of $\beta_i$
to changes in column $j$ of $X$. Specifically, $\gamma_{ij}$ is the norm of
the Fréchet derivative of $\beta_i$ regarded as a function of the $j$-th
column of $X$. If $\epsilon_j$ is the norm of the perturbation of the $j$-th
column of $X$, then $\gamma_{ij}\epsilon_j$ will be an asymptotic bound on
the perturbation induced in $\beta_i$.*


2. Derivation of the Coefficients

Although it is in principle possible to calculate the required
derivatives directly from the normal equations $(X'X)\beta = X'y$, we prefer
to approach the problem through a first order perturbation theorem that
is useful in its own right.

Theorem 2.1. In the notation of the last section, let $\beta$ be the
solution of the regression problem (1.1), and let $r = y - X\beta$ be the
corresponding residual vector. Let $E$ be an $n \times p$ matrix. If

$$\|X^\dagger\|\,\|E\| < 1, \tag{2.1}$$

then $X + E$ has rank $p$, so that there is a unique solution
$\tilde\beta$ of the problem

$$\text{minimize } \|y - (X+E)\tilde\beta\|^2.$$

Moreover, as $E$ approaches zero,

$$\tilde\beta = \beta - X^\dagger E\beta + (X'X)^{-1}E'r + O(\|E\|^2).$$
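
A quick numerical check of the expansion in Theorem 2.1 can be sketched as
follows. This is an illustration under assumed synthetic data, not part of
the original argument: if the remainder really is second order, shrinking
the perturbation by a factor of 10 should shrink the remainder by roughly a
factor of 100. The names solve_ls and remainder are hypothetical helpers.

```python
import numpy as np

# Synthetic data; sizes, coefficients, and seed are arbitrary assumptions.
rng = np.random.default_rng(1)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.5 * rng.normal(size=n)

def solve_ls(A, b):
    # Least squares solution of min ||b - A beta||.
    return np.linalg.lstsq(A, b, rcond=None)[0]

beta = solve_ls(X, y)
r = y - X @ beta
X_pinv = np.linalg.pinv(X)
XtX_inv = np.linalg.inv(X.T @ X)

# A fixed perturbation direction with unit spectral norm.
E0 = rng.normal(size=(n, p))
E0 /= np.linalg.norm(E0, 2)

def remainder(t):
    # Distance between the exact perturbed solution and the
    # first order approximation of Theorem 2.1, for E = t * E0.
    E = t * E0
    first_order = beta - X_pinv @ E @ beta + XtX_inv @ E.T @ r
    return np.linalg.norm(solve_ls(X + E, y) - first_order)

rem_big = remainder(1e-2)
rem_small = remainder(1e-3)
print(rem_big, rem_small, rem_big / rem_small)
```

A printed ratio close to 100 is consistent with an $O(\|E\|^2)$ remainder.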


Proof. For a proof that (2.1) implies that $X + E$ has full rank, see
[4]. This implies that for all sufficiently small $E$, $\tilde\beta$ exists
and is given by

$$\tilde\beta = [(X+E)'(X+E)]^{-1}(X+E)'y.$$

* Here, and throughout this note, the term asymptotic refers to behavior
for small $\epsilon$, not large $n$.

Now it is well known [3] that if $P$ is sufficiently small, then $I + P$
is nonsingular and

$$(I+P)^{-1} = I - P + O(\|P\|^2).$$

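It follows that

$$\begin{aligned}
[(X+E)'(X+E)]^{-1} &= (X'X + X'E + E'X + E'E)^{-1} \\
&= \{X'X[I + (X'X)^{-1}(X'E + E'X + E'E)]\}^{-1} \\
&= [I - (X'X)^{-1}(X'E + E'X)](X'X)^{-1} + O(\|E\|^2).
\end{aligned}$$

Hence

$$\begin{aligned}
\tilde\beta &= \beta - (X'X)^{-1}X'E(X'X)^{-1}X'y \\
&\qquad - (X'X)^{-1}E'X(X'X)^{-1}X'y + (X'X)^{-1}E'y + O(\|E\|^2) \\
&= \beta - X^\dagger E\beta + (X'X)^{-1}E'(y - X\beta) + O(\|E\|^2) \\
&= \beta - X^\dagger E\beta + (X'X)^{-1}E'r + O(\|E\|^2). \qquad \Box
\end{aligned}$$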

Theorem 2.1 immediately gives an expression for the Fréchet
derivative of $\beta_i$ regarded as a function of the $j$-th column of $X$.

Corollary 2.2. Let $\beta_i = f_{ij}(x_j)$, where $x_j$ denotes the
$j$-th column of $X$. Then

$$Df_{ij} = -\beta_j e_i'X^\dagger + e_i'(X'X)^{-1}e_j r'. \tag{2.2}$$

Proof. The Fréchet derivative is the unique row vector $Df_{ij}$
satisfying

$$f_{ij}(x_j + e) = \beta_i + Df_{ij}e + O(\|e\|^2). \tag{2.3}$$

Let $e_j$ denote the $j$-th unit vector. Then a perturbation $e$ in $x_j$
amounts to adding to $X$ the matrix $E = ee_j'$. Hence from Theorem 2.1

$$\begin{aligned}
f_{ij}(x_j + e) &= e_i'[\beta - X^\dagger(ee_j')\beta + (X'X)^{-1}(ee_j')'r] + O(\|e\|^2) \\
&= \beta_i - \beta_j e_i'X^\dagger e + e_i'(X'X)^{-1}e_j e'r + O(\|e\|^2) \\
&= \beta_i - (\beta_j e_i'X^\dagger - e_i'(X'X)^{-1}e_j r')e + O(\|e\|^2),
\end{aligned}$$

which shows that $Df_{ij}$ defined by (2.2) satisfies (2.3). $\Box$
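
Formula (2.2) can be compared against a finite difference approximation.
The sketch below is an illustration on assumed synthetic data; the helper
names beta_of, Df, and Df_numeric are hypothetical. It perturbs each entry
of column $j$ in turn and differences the resulting values of $\beta_i$.

```python
import numpy as np

# Synthetic data; sizes, coefficients, and seed are arbitrary assumptions.
rng = np.random.default_rng(2)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 3.0, -2.0]) + 0.3 * rng.normal(size=n)

def beta_of(A):
    # Regression coefficients for design matrix A and the fixed y.
    return np.linalg.lstsq(A, y, rcond=None)[0]

beta = beta_of(X)
r = y - X @ beta
X_pinv = np.linalg.pinv(X)        # p x n
C = np.linalg.inv(X.T @ X)        # C = (X'X)^{-1}

def Df(i, j):
    # Row vector of length n from (2.2): -beta_j e_i' X^+ + c_ij r'.
    return -beta[j] * X_pinv[i, :] + C[i, j] * r

def Df_numeric(i, j, h=1e-6):
    # Central differences of beta_i with respect to each entry of column j.
    g = np.zeros(n)
    for k in range(n):
        Xp, Xm = X.copy(), X.copy()
        Xp[k, j] += h
        Xm[k, j] -= h
        g[k] = (beta_of(Xp)[i] - beta_of(Xm)[i]) / (2 * h)
    return g

i, j = 0, 1
max_gap = np.max(np.abs(Df(i, j) - Df_numeric(i, j)))
print(max_gap)
```

The two gradients should agree up to the accuracy of the finite difference
scheme.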
Corollary 2.3. Let $C = (X'X)^{-1}$. Then

$$\gamma_{ij} \equiv \|Df_{ij}\| = \sqrt{\beta_j^2 c_{ii} + c_{ij}^2\|r\|^2}.$$

Proof. The vector $r$ is orthogonal to the column space of $X$,
which is the same as the row space of $X^\dagger$. Hence $e_i'X^\dagger$
and $r'$ are orthogonal, and

$$\|Df_{ij}\|^2 = \beta_j^2\|e_i'X^\dagger\|^2 + c_{ij}^2\|r\|^2.$$

The proof will be complete if we can show that
$\|e_i'X^\dagger\|^2 = c_{ii}$. But

$$\|e_i'X^\dagger\|^2 = \|e_i'(X'X)^{-1}X'\|^2 = e_i'(X'X)^{-1}X'X(X'X)^{-1}e_i = e_i'(X'X)^{-1}e_i = c_{ii}. \qquad \Box$$
3. Applications

It should be observed that the numbers $\gamma_{ij}$ can be calculated
from the quantities that are usually computed in the course of the
regression. For example, if sweep techniques (e.g. see [2]) have been used
to solve the problem, the numbers $c_{ij}$ will be available, as will
$\|r\|^2$, since it is simply the residual sum of squares.
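
As a minimal sketch of this computation, the function below forms the whole
matrix of coefficients from $C$, $\beta$, and the residual sum of squares,
following Corollary 2.3. The function name sensitivity_coefficients and the
synthetic data are assumptions made for illustration. Here $C$ is obtained
by explicit inversion for brevity; a production routine would more likely
take it from whatever factorization or sweep it already performs.

```python
import numpy as np

def sensitivity_coefficients(C, beta, rss):
    """Return gamma with gamma[i, j] = sqrt(beta_j^2 c_ii + c_ij^2 rss)."""
    c_diag = np.diag(C)
    # np.outer(c_diag, beta**2)[i, j] equals c_ii * beta_j^2.
    return np.sqrt(np.outer(c_diag, beta**2) + C**2 * rss)

# Synthetic data; sizes, coefficients, and seed are arbitrary assumptions.
rng = np.random.default_rng(3)
X = rng.normal(size=(25, 3))
y = X @ np.array([1.0, -1.0, 2.0]) + 0.2 * rng.normal(size=25)

# Quantities a regression routine would ordinarily have on hand.
C = np.linalg.inv(X.T @ X)
beta = C @ (X.T @ y)
r = y - X @ beta
rss = float(r @ r)                # residual sum of squares

gamma = sensitivity_coefficients(C, beta, rss)
print(np.round(gamma, 4))
```

Row $i$ of the result describes how sensitive $\beta_i$ is to errors in
each of the columns of $X$.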

The results are readily interpretable. If a bound $\epsilon_j$ on the
norm of the error in the $j$-th column is available, then the error
$\delta_i$ induced in $\beta_i$ is asymptotically bounded by
$\gamma_{ij}\epsilon_j$. If this number is too large, the problem requires
further study. In interpreting the results it should always be borne in
mind that $\gamma_{ij}\epsilon_j$ is a first order bound. A large value is
a signal that something may be wrong, but if $\epsilon_j$ is so large that
the first order approximation is not applicable, then the difficulties may
turn out to be illusory.
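
The behavior of the bound for a small perturbation can be illustrated as
follows. This is a sketch on assumed synthetic data with an assumed error
level eps_j; the helper names fit and delta_i are hypothetical. It compares
the bound with the changes produced by random perturbations of the given
norm, and with the change produced by a perturbation aligned with
$Df_{ij}$, which attains the bound to first order.

```python
import numpy as np

# Synthetic data; sizes, coefficients, seed, and eps_j are assumptions.
rng = np.random.default_rng(4)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 2.0, -1.5]) + 0.3 * rng.normal(size=n)

def fit(A):
    return np.linalg.lstsq(A, y, rcond=None)[0]

beta = fit(X)
r = y - X @ beta
C = np.linalg.inv(X.T @ X)
X_pinv = np.linalg.pinv(X)

i, j, eps_j = 0, 1, 1e-3
gamma_ij = np.sqrt(beta[j]**2 * C[i, i] + C[i, j]**2 * (r @ r))
bound = gamma_ij * eps_j

def delta_i(e):
    # Change in beta_i when column j of X is perturbed by e.
    Xe = X.copy()
    Xe[:, j] += e
    return fit(Xe)[i] - beta[i]

# Largest change seen over random perturbations of norm eps_j.
worst_random = 0.0
for _ in range(200):
    e = rng.normal(size=n)
    e *= eps_j / np.linalg.norm(e)
    worst_random = max(worst_random, abs(delta_i(e)))

# Perturbation parallel to Df_ij, the first order worst case.
d = -beta[j] * X_pinv[i, :] + C[i, j] * r
attained = abs(delta_i(eps_j * d / np.linalg.norm(d)))
print(bound, worst_random, attained)
```

Random perturbations typically fall well below the bound, while the aligned
perturbation comes close to it.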

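A particularly attractive feature of the asymptotic bounds is that,
since they deal with individual components of $\beta$, their
interpretation is independent of the scaling of the columns of $X$. This is
particularly apparent if the bounds are cast in terms of relative error in
the form

$$\frac{|\delta_i|}{|\beta_i|} \lesssim \frac{\gamma_{ij}\|x_j\|}{|\beta_i|} \cdot \frac{\epsilon_j}{\|x_j\|}.$$

Now

$$\frac{\gamma_{ij}\|x_j\|}{|\beta_i|} = \sqrt{\left(\frac{c_{ii}}{\beta_i^2}\right)\left(\beta_j\|x_j\|\right)^2 + \left(\frac{c_{ij}\|x_j\|}{\beta_i}\right)^2\left(\|r\|\right)^2}, \tag{3.1}$$

and each parenthesized term in the right hand side of (3.1) is easily seen
to be invariant under scaling of the columns of $X$. Indeed, in some
applications where the $\beta_i$'s are known to be bounded away from zero
it may be more appropriate to report $\gamma_{ij}\|x_j\|/|\beta_i|$ than
$\gamma_{ij}$.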
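
The invariance of the relative coefficients can be checked numerically. The
sketch below uses assumed synthetic data and an arbitrary diagonal scaling
S; the function names gamma_matrix and relative_coefficients are
hypothetical. It computes $\gamma_{ij}\|x_j\|/|\beta_i|$ before and after
rescaling the columns.

```python
import numpy as np

def gamma_matrix(X, y):
    # gamma[i, j] = sqrt(beta_j^2 c_ii + c_ij^2 ||r||^2), C = (X'X)^{-1}.
    C = np.linalg.inv(X.T @ X)
    beta = C @ (X.T @ y)
    r = y - X @ beta
    gamma = np.sqrt(np.outer(np.diag(C), beta**2) + C**2 * (r @ r))
    return beta, gamma

def relative_coefficients(X, y):
    # rho[i, j] = gamma_ij * ||x_j|| / |beta_i|
    beta, gamma = gamma_matrix(X, y)
    col_norms = np.linalg.norm(X, axis=0)
    return gamma * col_norms[None, :] / np.abs(beta)[:, None]

# Synthetic data; sizes, coefficients, seed, and scaling are assumptions.
rng = np.random.default_rng(5)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.2 * rng.normal(size=30)
S = np.diag([1.0, 50.0, 0.02])    # change of units for the columns

_, gamma = gamma_matrix(X, y)
_, gamma_scaled = gamma_matrix(X @ S, y)
rho = relative_coefficients(X, y)
rho_scaled = relative_coefficients(X @ S, y)
print(np.max(np.abs(rho - rho_scaled) / rho))
```

The raw coefficients change with the units of the columns, whereas the
relative coefficients agree to rounding error.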

In deriving our bounds, we have used the Cauchy-Schwarz inequality
$|x'y| \le \|x\|\,\|y\|$, an inequality which is usually pessimistic, since
it must account for the worst case where $x$ and $y$ are dependent. If we
are willing to assume more about the perturbation $e$ in $x_j$, then we may
be able to say more. For example, we have the following consequence of
Corollary 2.3.

Corollary 3.1. Let $e \in N(0, \sigma^2 I)$. Then $Df_{ij}e$ is normally
distributed with mean zero and standard deviation $\gamma_{ij}\sigma$.
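
A small Monte Carlo experiment illustrates the statement. The sketch below
relies on assumed synthetic data, an assumed noise level sigma, and an
arbitrary sample size: it draws many perturbations $e$ with independent
normal entries, applies the derivative (2.2) to each, and compares the
sample standard deviation of the first order changes with
$\gamma_{ij}\sigma$.

```python
import numpy as np

# Synthetic data; sizes, coefficients, seed, and sigma are assumptions.
rng = np.random.default_rng(6)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 2.0, -1.0]) + 0.3 * rng.normal(size=n)

C = np.linalg.inv(X.T @ X)
beta = C @ (X.T @ y)
r = y - X @ beta
X_pinv = C @ X.T                  # pseudo-inverse of a full-rank X

i, j, sigma = 0, 1, 1e-3
d = -beta[j] * X_pinv[i, :] + C[i, j] * r     # Df_ij from (2.2)
gamma_ij = np.sqrt(beta[j]**2 * C[i, i] + C[i, j]**2 * (r @ r))

# Each column of E_samples is one draw of e with N(0, sigma^2) entries.
E_samples = sigma * rng.normal(size=(n, 20000))
first_order_changes = d @ E_samples

sample_mean = first_order_changes.mean()
sample_std = first_order_changes.std()
print(sample_mean, sample_std, gamma_ij * sigma)
```

The sample standard deviation should be close to $\gamma_{ij}\sigma$, with
the sample mean near zero.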